Skip to content

New debug draw extension for AABBs #900

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Conversation

keptsecret
Copy link
Contributor

No description provided.

Comment on lines +17 to +18
public:
struct SCachedCreationParameters

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more indent please

#include "nbl/builtin/hlsl/shapes/aabb.hlsl"
#include "nbl/ext/DebugDraw/builtin/hlsl/common.hlsl"

namespace nbl::ext::debugdraw

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either DebugDraw or debug_draw

Comment on lines +1 to +2
#ifndef _DRAW_AABB_COMMON_HLSL
#define _DRAW_AABB_COMMON_HLSL

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

full header guard, with all namespaces prefixed

Comment on lines +15 to +19
#ifdef __HLSL_VERSION
float32_t3x4 transform;
#else
float transform[3*4];
#endif

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cpp_compat gives you float32_t3x4 in C++ too

Comment on lines +25 to +29
#ifdef __HLSL_VERSION
float32_t4x4 MVP;
#else
float MVP[4*4];
#endif

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again

struct PSInput
{
float32_t4 position : SV_Position;
float32_t4 color : TEXCOORD0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you using HW attributes for color? the color is per-instance

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah these are Vx-Px inter-stage shenanigans, I'll allow, make sure you label color flat though

@@ -0,0 +1,13 @@
#pragma shader_stage(fragment)

#include "common.hlsl"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nbl/ext/DebugDraw/builtin/hlsl/common.hlsl is preferred

PSInput output;

float32_t3 vertex = (bda::__ptr<float32_t3>::create(pc.pVertexBuffer) + glsl::gl_VertexIndex()).deref_restrict().load();
InstanceData instance = vk::RawBufferLoad<InstanceData>(pc.pInstanceBuffer + sizeof(InstanceData) * glsl::gl_InstanceIndex());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try to use vk::BufferPointer instead of RawBufferLoad

Comment on lines +20 to +25
float32_t4x4 transform;
transform[0] = instance.transform[0];
transform[1] = instance.transform[1];
transform[2] = instance.transform[2];
transform[3] = float32_t4(0, 0, 0, 1);
float32_t4 position = mul(transform, float32_t4(vertex, 1));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a util in math::linalg to carry out a fast multiple like that, see how I use it in my simpleDebugRenderer's shaders in Examples-Tests sumbouldes' common folder

transform[2] = instance.transform[2];
transform[3] = float32_t4(0, 0, 0, 1);
float32_t4 position = mul(transform, float32_t4(vertex, 1));
output.position = mul(pc.MVP, position);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the instance data is emphemeral/transient (uploaded just before the draw) can you not pre-multiply the camera MVP with the instance transform on the CPU ?

Such that each instance stores combined MVP*transform ?

// records draw command for single AABB, user has to set pipeline outside
bool renderSingle(video::IGPUCommandBuffer* commandBuffer);

bool render(video::IGPUCommandBuffer* commandBuffer, video::ISemaphore::SWaitInfo waitInfo, float* cameraMat3x4);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

take the camera MVP as const hlsl::float32_t4x4&


bool render(video::IGPUCommandBuffer* commandBuffer, video::ISemaphore::SWaitInfo waitInfo, float* cameraMat3x4);

static std::array<hlsl::float32_t3, 24> getVerticesFromAABB(const core::aabbox3d<float>& aabb);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you need it for?


static std::array<hlsl::float32_t3, 24> getVerticesFromAABB(const core::aabbox3d<float>& aabb);

void addAABB(const core::aabbox3d<float>& aabb, const hlsl::float32_t4& color = { 1,0,0,1 });

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't provide legacy API with legacy types

void addAABB(const core::aabbox3d<float>& aabb, const hlsl::float32_t4& color = { 1,0,0,1 });
void addAABB(const hlsl::shapes::AABB<3,float>& aabb, const hlsl::float32_t4& color = { 1,0,0,1 });

void addOBB(const hlsl::shapes::AABB<3, float>& aabb, const hlsl::float32_t3x4 transform, const hlsl::float32_t4& color = { 1,0,0,1 });

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the transform shall be the OBB, adding an AABB into the mix is confusing

static bool createStreamingBuffer(SCreationParameters& params);

std::vector<debugdraw::InstanceData> m_instances;
std::array<hlsl::float32_t3, 24> m_unitAABBVertices;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indexed line list takes 8 verts

static core::smart_refctd_ptr<video::IGPUGraphicsPipeline> createPipeline(SCreationParameters& params);
static bool createStreamingBuffer(SCreationParameters& params);

std::vector<debugdraw::InstanceData> m_instances;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

immediate mode for GUI is ok, but I'd like to avoid it here, get rid of add and make render take a span<const InstanceData>

std::chrono::steady_clock::time_point waitTill = std::chrono::steady_clock::now() + std::chrono::milliseconds(1u);
streaming->multi_allocate(waitTill, 1, &inputOffset, &totalSize, &MaxAlignment);

memcpy(streamingPtr + vertexByteOffset, m_unitAABBVertices.data(), sizeof(m_unitAABBVertices[0]) * m_unitAABBVertices.size());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a small buffer at startup containing the indices (please draw indexed) so you're not copying it in all the time

P.S. the vertices should be a const float32_t3[8] array in the vertex shader

Comment on lines +300 to +337
std::array<float32_t3, 24> DrawAABB::getVerticesFromAABB(const core::aabbox3d<float>& aabb)
{
const auto& pMin = aabb.MinEdge;
const auto& pMax = aabb.MaxEdge;

std::array<float32_t3, 24> vertices;
vertices[0] = float32_t3(pMin.X, pMin.Y, pMin.Z);
vertices[1] = float32_t3(pMax.X, pMin.Y, pMin.Z);
vertices[2] = float32_t3(pMin.X, pMin.Y, pMin.Z);
vertices[3] = float32_t3(pMin.X, pMin.Y, pMax.Z);

vertices[4] = float32_t3(pMax.X, pMin.Y, pMax.Z);
vertices[5] = float32_t3(pMax.X, pMin.Y, pMin.Z);
vertices[6] = float32_t3(pMax.X, pMin.Y, pMax.Z);
vertices[7] = float32_t3(pMin.X, pMin.Y, pMax.Z);

vertices[8] = float32_t3(pMin.X, pMax.Y, pMin.Z);
vertices[9] = float32_t3(pMax.X, pMax.Y, pMin.Z);
vertices[10] = float32_t3(pMin.X, pMax.Y, pMin.Z);
vertices[11] = float32_t3(pMin.X, pMax.Y, pMax.Z);

vertices[12] = float32_t3(pMax.X, pMax.Y, pMax.Z);
vertices[13] = float32_t3(pMax.X, pMax.Y, pMin.Z);
vertices[14] = float32_t3(pMax.X, pMax.Y, pMax.Z);
vertices[15] = float32_t3(pMin.X, pMax.Y, pMax.Z);

vertices[16] = float32_t3(pMin.X, pMin.Y, pMin.Z);
vertices[17] = float32_t3(pMin.X, pMax.Y, pMin.Z);
vertices[18] = float32_t3(pMax.X, pMin.Y, pMin.Z);
vertices[19] = float32_t3(pMax.X, pMax.Y, pMin.Z);

vertices[20] = float32_t3(pMin.X, pMin.Y, pMax.Z);
vertices[21] = float32_t3(pMin.X, pMax.Y, pMax.Z);
vertices[22] = float32_t3(pMax.X, pMin.Y, pMax.Z);
vertices[23] = float32_t3(pMax.X, pMax.Y, pMax.Z);

return vertices;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

function not needed, you shall store this array in vertex shader and index it with gl_VertexIndex

Comment on lines +339 to +367
void DrawAABB::addAABB(const core::aabbox3d<float>& aabb, const hlsl::float32_t4& color)
{
addAABB(shapes::AABB<3, float>{{aabb.MinEdge.X, aabb.MinEdge.Y, aabb.MinEdge.Z}, { aabb.MaxEdge.X, aabb.MaxEdge.Y, aabb.MaxEdge.Z }}, color);
}

void DrawAABB::addAABB(const hlsl::shapes::AABB<3,float>& aabb, const hlsl::float32_t4& color)
{
const auto transform = hlsl::float32_t3x4(1);
addOBB(aabb, transform, color);
}

void DrawAABB::addOBB(const hlsl::shapes::AABB<3, float>& aabb, const hlsl::float32_t3x4 transform, const hlsl::float32_t4& color)
{
InstanceData instance;
instance.color = color;

core::matrix3x4SIMD instanceTransform;
instanceTransform.setTranslation(core::vectorSIMDf(aabb.minVx.x, aabb.minVx.y, aabb.minVx.z, 0));
const auto diagonal = aabb.getExtent();
instanceTransform.setScale(core::vectorSIMDf(diagonal.x, diagonal.y, diagonal.z));

core::matrix3x4SIMD worldTransform;
memcpy(worldTransform.pointer(), &transform, sizeof(transform));

instanceTransform = core::concatenateBFollowedByA(worldTransform, instanceTransform);
memcpy(instance.transform, instanceTransform.pointer(), sizeof(core::matrix3x4SIMD));

m_instances.push_back(instance);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather have different constructors or factories of InstanceData from transform, aabb, color and camera mvp than this immediate mode non-threadsafe (one hidden implicit vector) thing

Comment on lines +236 to +242
bool DrawAABB::renderSingle(IGPUCommandBuffer* commandBuffer)
{
commandBuffer->setLineWidth(1.f);
commandBuffer->draw(24, 1, 0, 0);

return true;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function kinda makes no sense, you'd need to have a separate pipeline which takes the instance data as a push constant

here to draw a single AABB I have to jump through more hoops than calling DrawAABB::render

Comment on lines +246 to +251
using offset_t = SCachedCreationParameters::streaming_buffer_t::size_type;
constexpr auto MdiSizes = std::to_array<offset_t>({ sizeof(float32_t3), sizeof(InstanceData) });
// shared nPoT alignment needs to be divisible by all smaller ones to satisfy an allocation from all
constexpr offset_t MaxAlignment = std::reduce(MdiSizes.begin(), MdiSizes.end(), 1, [](const offset_t a, const offset_t b)->offset_t {return std::lcm(a, b); });
// allocator initialization needs us to round up to PoT
const auto MaxPOTAlignment = roundUpToPoT(MaxAlignment);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you shall only allocate the instance data in the MDI buffer, it will make all the logic here a lot simpler!

Index buffer must be premade and constant at create time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants